•  US  Air  Force  Policy  and  Background

•  Defensible  Test  Approach 

•  Case  Study:  Digital  Engine  Control  Logic  Upgrade 

•  Test  Item  Description 

•  Test  Objective 

•  Historical  Approach 

•  New  Defensible  Approach 

•  Challenges 

•  Conclusions 
Bottom Line Up Front

• Test is a science... not an art

-  Talented  engineers  but  limited  knowledge  in  test  design 

-  Determining  statistical  confidence  and  power  allows  mathematically 
defensible  conclusions 

• Air Force decisions are too important to be left to professional
opinion alone... our decisions should be based on mathematical fact


“I  contend  that  all  experiments  are  designed.  Some  are  designed  by  intuition 
and  gut  feel.  Other  experiments  ...  according  to  a  rigorous  statistical  protocol 
....In  either  case,  experiments  are  designed.” — Gregory  Alexander 



AF  Policy 


AFFTC 


• US Air Force policy for operational testing requires the use of Design of
Experiments (DOE) as a discipline to improve planning, execution,
analysis, and reporting. Policy states:

-  "Whenever  possible,  operational  evaluations  must  include  a  rigorous 
assessment  of  the  confidence  level  of  the  test,  the  power  of  the  test 
and  some  measure  of  how  well  the  test  spans  the  operational 
envelope." [1, 2]

• Air Force leadership - "Encourages use of DOE to increase developmental
test rigor"

• The Defense Acquisition Guidebook is being updated to apply DOE when
developing test strategies


Currently,  no  formal  policy  requires  DOE  use  in  developmental  testing 


1 DOT&E Memo, May 2009, Subject: Test and Evaluation Initiative to Apply DOE Across Entire Acquisition Development Cycle
https://acc.dau.mil/CommunityBrowser.aspx?id=312213&lang=en-US
2 DOT&E Memo, 24 Nov 09, Subject: Test and Evaluation Initiatives
http://www.dote.osd.mil/about/TE Initiatives.pdf


AFFTC's Defensible History


•  AFFTC  Technical  Advisor  started  effort  in  2005 

-  Early  training  from  46th  Test  Wing  (Operational  Test  -  OT)  at  Eglin  AFB 

- AFFTC directive in 2007 (... implement Scientific Methods)

•  Potential  BUT... 

- Oversimplified operational examples were hard to argue with, but their
application and assumptions for developmental testing were unclear

•  AFFTC  Propulsion  test  history 

- Tried to implement statistical approaches (Student's t, uncertainty, etc.)
to compare test results, with limited success

-  Slow  progress  improving  methods 

-  Now  regularly  working  with  AFFTC  statistics  group  and  AEDC 
Defensible Test Approach

• Defensible testing is a statistical approach, but it also emphasizes the need
for better test planning by:

- Understanding the system under test

- Defining clear and achievable test objectives

- Ensuring performance metrics are observable and measurable

- Understanding instrumentation accuracy and uncertainty propagation

•  Statistical  approach  determines  acceptable  Power  and  Confidence 

- Power is the probability that the test will capture a difference between two data
sets if a difference exists

-  Confidence  is  the  probability  that  a  prediction  is  correct 
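
The two quantities can be illustrated with a small simulation. The sketch below is not from the briefing: it assumes normally distributed data with a known standard deviation and a two-sided z-test (chosen only to keep the example dependency-free), and it treats confidence in the usual hypothesis-testing sense, as one minus the rate of falsely declaring a difference. All names and numbers are hypothetical.

```python
import random
from statistics import NormalDist, mean


def rejection_rate(true_diff, n_per_group, sigma=1.0, alpha=0.05,
                   trials=2000, seed=0):
    """Fraction of simulated tests that declare a difference.

    Assumption: two-sided z-test with known sigma. A real analysis with
    an estimated standard deviation would use a t-test instead.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    std_err = sigma * (2.0 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(trials):
        legacy = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        revised = [rng.gauss(true_diff, sigma) for _ in range(n_per_group)]
        z = (mean(revised) - mean(legacy)) / std_err
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# No true difference: the rejection rate estimates the false-alarm rate,
# so 1 - false_alarm estimates the confidence level.
false_alarm = rejection_rate(true_diff=0.0, n_per_group=10)

# A true difference of one standard deviation: the rejection rate
# estimates the power of the test at this sample size.
power = rejection_rate(true_diff=1.0, n_per_group=10, seed=1)

print(f"confidence ~ {1 - false_alarm:.2f}, power ~ {power:.2f}")
```

Raising `n_per_group` or `true_diff` in this sketch raises the estimated power, while the false-alarm rate stays near `alpha`.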


The following case study highlights the development of the statistical approach
Defensible Testing

Case Study - Engine Control Upgrade


Defensible Case Study -
Digital Engine Control Upgrade

Controller modified to improve stall margin in the heart of the aircraft flight envelope



Key  logic  changes: 

•  Compressor  variable  vane 
camber  was  scheduled  several 
degrees  more  closed 

•  Logic  was  activated  and 
deactivated  based  on  aircraft 
Mach,  PT2,  TT2 

•  Only  active  at  high  throttle 
settings 


•  Overall  Objective:  Evaluate  engine  stability  and  thrust 
response  with  revised  engine  control  logic  in 
comparison  to  the  legacy  engine  control  logic 


•  Specific  Objective:  Evaluate  revised  afterburner 
transient  capability,  specifically  time-to-light  and 
time-to-MAX  and  compare  to  legacy  results 


Test  Matrix 


[Figure: test matrix; x-axis: Mach Number]


• Test matrix developed to evaluate thrust response and engine stability, focusing on
the most challenging flight conditions and logic implementation areas


Defensible  Case  Study- 
Old  -  Comparison  Approach 


Historical Approach - Time history plot comparison IDLE-MAX

[Figure: time history plot; x-axis: Elapsed Time]
Defensible Case Study-
Old - Comparison Approach

Historical Approach - Time to thrust comparison with 3σ error bars


• Past analysis compared thrust response times for
all IDLE-MAX throttle transients

• Grouping the results masked the effect of the logic change

• Error bars didn't provide additional insight

• Final conclusion: "In general, afterburner light-off
time and time to maximum afterburner
operation were comparable to the legacy logic."


[Figure: x-axis: Configuration]
New Defensible Techniques

Defensible Case Study-
New - Defensible Objective

• Old Specific Objective: Evaluate revised
afterburner transient capability, specifically
time-to-light and time-to-MAX and compare
to legacy results

• New Specific Objective: Determine with
statistical confidence if the revised logic
thrust response has degraded in the stall
avoidance flight regime as compared to the
legacy logic
Defensible Case Study-
New Defensible Techniques

[Figure: Time to Stable Afterburner, shown as a Linear Scale Plot and a Log Scale Plot]

Experience says thrust response is a function of engine face total pressure (PT2)
Power (Log-Log) transform allows model-based statistical analysis
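
Why the log-log transform helps can be shown with a short sketch. It assumes, purely for illustration, that the response time follows a power law in PT2; the constants `a` and `b` and the PT2 values are hypothetical, not test data. Under that assumption the relationship is curved on a linear scale but a straight line with slope `b` on a log-log scale, which is what makes a linear regression model usable.

```python
import math


def power_law(pt2, a, b):
    """Hypothetical power-law response: time = a * PT2**b."""
    return a * pt2 ** b


def segment_slopes(xs, ys):
    """Slope of each straight segment joining consecutive points."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            for i in range(len(xs) - 1)]


a, b = 17.0, -0.5                      # hypothetical constants
pt2_values = [2.0, 4.0, 8.0, 16.0]     # hypothetical PT2 values
times = [power_law(p, a, b) for p in pt2_values]

# On a linear scale the slope changes from segment to segment.
linear_slopes = segment_slopes(pt2_values, times)

# After taking logs of both axes every segment has the same slope, b,
# because ln(time) = ln(a) + b * ln(PT2).
log_slopes = segment_slopes([math.log(p) for p in pt2_values],
                            [math.log(t) for t in times])

print("linear-scale slopes:", [round(s, 3) for s in linear_slopes])
print("log-log slopes:     ", [round(s, 3) for s in log_slopes])
```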
Defensible Case Study-
Analysis using Inferential Model


•  System  was  modeled  such  that  "Logic"  was  an  independent 
variable 


If Logic = 0, then the model is

$$\ln(\text{Time to AB Stable}) = \beta_0 + \beta_1 \ln(PT2) + \beta_2 (0) = \beta_0 + \beta_1 \ln(PT2)$$

If Logic = 1, then the model is

$$\ln(\text{Time to AB Stable}) = \beta_0 + \beta_1 \ln(PT2) + \beta_2 (1) = (\beta_0 + \beta_2) + \beta_1 \ln(PT2) = \alpha_0 + \beta_1 \ln(PT2)$$


- If coefficient β2 is statistically significant, there is evidence of a
difference between the legacy and revised results

- β2 is the parallel shift from the legacy model to the revised
model.
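
A minimal sketch of fitting such a parallel lines model by ordinary least squares follows. The data are synthetic, the coefficient values are hypothetical, and the helper names (`solve`, `fit_parallel_lines`) are illustrative; the briefing does not say which software produced the actual fit.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_parallel_lines(pt2, logic, time_s):
    """Fit ln(time) = b0 + b1*ln(PT2) + b2*Logic; return (coefs, t_values)."""
    rows = [[1.0, math.log(p), float(g)] for p, g in zip(pt2, logic)]
    y = [math.log(t) for t in time_s]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(bj * xj for bj, xj in zip(beta, r))
             for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in resid) / (len(y) - k)   # residual variance
    t_values = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        var_j = s2 * solve(xtx, unit)[j]            # s2 * [(X'X)^-1]_jj
        t_values.append(beta[j] / math.sqrt(var_j))
    return beta, t_values


# Synthetic data: hypothetical coefficients plus small random scatter.
rng = random.Random(42)
true_b0, true_b1, true_b2 = 2.8, -0.5, 0.25
pt2, logic, time_s = [], [], []
for i in range(20):
    p = 3.0 + 0.5 * i
    g = i % 2                                       # alternate the two logics
    ln_t = true_b0 + true_b1 * math.log(p) + true_b2 * g + rng.gauss(0.0, 0.05)
    pt2.append(p)
    logic.append(g)
    time_s.append(math.exp(ln_t))

beta, t_values = fit_parallel_lines(pt2, logic, time_s)
print("coefficients:", [round(v, 3) for v in beta])
print("t values:    ", [round(v, 2) for v in t_values])
```

A large absolute t value for the third coefficient is what signals a shift between the two lines; turning it into a p-value requires the t distribution with n - 3 degrees of freedom, which this dependency-free sketch omits.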
Defensible Case Study-
Parallel Lines Model


Statistical  Results:  (Inside  Stall  Avoidance  Region) 

• The t-test for the third coefficient shows a very small p-value, indicating statistical
confidence that a difference in thrust response exists between the legacy and revised logics

• The shift of 0.26180 logarithmic units translates to a median time increase of 30 pct with the
revised logic


        Estimate    Std. Error   t value   p-value
β0       2.85750    0.23050      12.397    2.77e-09
β1      -0.54009    0.09095      -5.938    2.72e-05
β2      -0.26180    0.05946      -4.403    0.000514
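
The step from logarithmic units to a percent change in the median, and the t value reported for the third coefficient, can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it works with the magnitude of the shift because the slides do not state which logic is coded as 1.

```python
import math


def percent_change_in_median(log_shift):
    """Percent change in the median implied by a shift on the natural-log scale."""
    return (math.exp(log_shift) - 1.0) * 100.0


def t_value(estimate, std_error):
    """t statistic for testing whether a coefficient differs from zero."""
    return estimate / std_error


# Estimate and standard error of the third coefficient, as read from the
# results table for the region inside the stall avoidance logic.
beta2, beta2_se = -0.26180, 0.05946

pct = percent_change_in_median(abs(beta2))
t = t_value(beta2, beta2_se)

print(f"median time ratio: {math.exp(abs(beta2)):.3f} ({pct:.1f} pct change)")
print(f"t value: {t:.3f}")
```

Because the model is fitted to the logarithm of time, a constant shift in log units corresponds to a constant ratio of median times, which is why the result is quoted as a percentage and not as a number of seconds.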

[Figure: Interpreting the size of a p-value. Scale of p-value against "Is there evidence of a difference?" Reference: The Statistical Sleuth]
Defensible Case Study -
Parallel Lines Model

Statistical Results: (Outside Stall Avoidance Region)


• The t-test for the third coefficient shows a very small p-value, indicating statistical
confidence that a difference in thrust response exists between the legacy and revised logics

• The shift of 0.1423 logarithmic units translates to a median time increase of 15 pct with the
revised logic


        Estimate    Std. Error   t value   p-value
β0       2.54658    0.07393      34.447    < 2e-16
β1      -0.45034    0.04321     -10.422    < 2e-16
β2      -0.1423     0.03883      -3.667    0.000946

• Outside the region there should be no difference. The observed difference may have been
caused by engine-to-engine variation, aircraft installation effects, variations in flight
conditions, or an actual difference caused by the engine control software


Baseline  testing  with  same  engine  and  aircraft  may  eliminate  this  uncertainty 
Defensible Case Study -
A few words about... Power


Definition:  Power  is  the  probability  that  the  test  will  capture  a  difference  between 
two  data  sets  if  a  difference  exists 


To capture a difference of ~30 pct

Power¹    Sample Size @ Measured Std Error
0.98      18
0.80      10

To capture a difference of ~10 pct

Power¹    Sample Size @ Measured Std Error
0.80      46

¹ Note: 80-percent power is sufficient when a failure is not life threatening and does not
cause a significant financial burden


Power is significantly affected by the magnitude of the difference the test is trying to detect
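
The trade between detectable difference and sample size can be sketched with a normal approximation. This is not the calculation behind the table above, which is not described in the briefing, so the numbers it produces will differ somewhat. It assumes a two-sided test at alpha = 0.05, a standard error that shrinks with the square root of the sample size, and a reference standard error of 0.05946 obtained from 18 points (the pairing of those two values is itself an assumption). All function names are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(log_shift, std_error, alpha=0.05):
    """Normal-approximation power of a two-sided test for a log-scale shift."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    ncp = abs(log_shift) / std_error
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


def scaled_std_error(ref_se, ref_n, n):
    """Assumption: the standard error scales as 1 / sqrt(sample size)."""
    return ref_se * (ref_n / n) ** 0.5


def smallest_n(log_shift, ref_se, ref_n, target_power, alpha=0.05, n_max=10000):
    """Smallest sample size whose approximate power reaches the target."""
    for n in range(4, n_max + 1):
        se = scaled_std_error(ref_se, ref_n, n)
        if approx_power(log_shift, se, alpha) >= target_power:
            return n
    return None


REF_SE, REF_N = 0.05946, 18   # assumed reference standard error and sample size

n_30pct = smallest_n(math.log(1.30), REF_SE, REF_N, target_power=0.80)
n_10pct = smallest_n(math.log(1.10), REF_SE, REF_N, target_power=0.80)

print(f"approximate n for a 30 pct difference: {n_30pct}")
print(f"approximate n for a 10 pct difference: {n_10pct}")
```

Even in this rough form, the sketch shows the same pattern as the table: shrinking the difference of interest from about 30 pct to about 10 pct multiplies the required sample size several times over.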
Defensible Risks/Challenges

•  Practical  issues  with  sample  size 

-  Typical  programs  don't  have  enough  time  or  $$$  to  execute  a 
statistically  relevant  test 

-  Early  tester  involvement  needed  to  influence  test  approach 

•  Confounding  variables  (life  degradation,  manufacturing 
tolerances,  installation  effects) 

- Testing one engine but making fleet decisions

•  Classical  aspects  of  DOE  require  randomization  during  execution 

-  Safety  often  requires  incremental  build-up  for  envelope  expansion 

•  Engineers  are  NOT  statisticians 

- It is extremely easy to draw erroneous conclusions with applied
statistics


Need  to  better  understand  these  risks/challenges 
Summary

• AF moving towards policy requiring use of defensible
test techniques

• Defensible test considerations include

- Test planning that addresses confidence, power, and
performance threshold

- Test objectives need to be clear and metrics measurable

- Statistical methods vary and require quality data

- Need to address risks/challenges identified